The R language has extensive graphical capabilities.
Graphics in R may be created by many different methods including base graphics and more advanced plotting packages such as lattice.
The ggplot2 package was created by Hadley Wickham and provides a intuitive plotting system to rapidly generate publication quality graphics.
ggplot2 builds on the concept of the “Grammar of Graphics” (Wilkinson 2005, Bertin 1983) which describes a consistent syntax for the construction of a wide range of complex graphics by a concise description of their components.
The structured syntax and high level of abstraction used by ggplot2 should allow for the user to concentrate on the visualisations instead of creating the underlying code.
On top of this central philosophy ggplot2 has:
Overview of example code for the ggplot2 scatter plot.
ggplot(data = <default data set>,
aes(x = <default x axis variable>,
y = <default y axis variable>,
... <other default aesthetic mappings>),
... <other plot defaults>) +
geom_scatter(aes(size = <size variable for this geom>,
... <other aesthetic mappings>),
data = <data for this point geom>,
stat = <statistic string or function>,
position = <position string or function>,
color = <"fixed color specification">,
<other arguments, possibly passed to the _stat_ function) +
scale_<aesthetic>_<type>(name = <"scale label">,
breaks = <where to put tick marks>,
labels = <labels for tick marks>,
... <other options for the scale>) +
ggtitle("Graphics/Plot")+
xlab("Weight")+
ylab("Height")+
theme(plot.title = element_text(colour = "gray"),
... <other theme elements>)
library(ggplot2)
library(dplyr)
anno <- read.csv("../annotate-and-filter/myanno.hg19_multianno.joined.csv")
anno <- tbl_df(anno)
anno
## Source: local data frame [77,702 x 32]
##
## Chr Start End Ref Alt Func.refGene
## (fctr) (int) (int) (fctr) (fctr) (fctr)
## 1 1 10181 10181 A C intergenic
## 2 1 17222 17222 A G ncRNA_intronic
## 3 1 74570 74570 T A intergenic
## 4 1 124583 124583 C T intergenic
## 5 1 231455 231455 T A intergenic
## 6 1 234432 234432 A G intergenic
## 7 1 234556 234556 C T intergenic
## 8 1 234760 234760 A T intergenic
## 9 1 237545 237545 C A intergenic
## 10 1 245264 245264 A G intergenic
## .. ... ... ... ... ... ...
## Variables not shown: Gene.refGene (fctr), GeneDetail.refGene (fctr),
## ExonicFunc.refGene (fctr), AAChange.refGene (fctr), cytoBand (fctr),
## gwasCatalog (fctr), snp129 (fctr), X1000g2014oct_eur (dbl),
## X1000g2014oct_afr (dbl), CLINSIG (fctr), CLNDBN (fctr), CLNACC (fctr),
## CLNDSDB (fctr), CLNDSDBID (fctr), DP (int), MP (dbl), GP (dbl), TG
## (fctr), TP (dbl), SG (fctr), SP (dbl), DS (fctr), GT.NORMAL (fctr),
## GT.TUMOUR (fctr), PM.NORMAL (dbl), PM.TUMOUR (dbl)
As seen above, in order to produce a ggplot2 graph we need a minimum of:-
In the code below we define the data as our cleaned patients data frame.
pcPlot <- ggplot(data=anno)
class(pcPlot)
## [1] "gg" "ggplot"
pcPlot$data[1:4,]
## Source: local data frame [4 x 32]
##
## Chr Start End Ref Alt Func.refGene Gene.refGene
## (fctr) (int) (int) (fctr) (fctr) (fctr) (fctr)
## 1 1 10181 10181 A C intergenic NONE,DDX11L1
## 2 1 17222 17222 A G ncRNA_intronic WASH7P
## 3 1 74570 74570 T A intergenic OR4F5,LOC729737
## 4 1 124583 124583 C T intergenic OR4F5,LOC729737
## Variables not shown: GeneDetail.refGene (fctr), ExonicFunc.refGene (fctr),
## AAChange.refGene (fctr), cytoBand (fctr), gwasCatalog (fctr), snp129
## (fctr), X1000g2014oct_eur (dbl), X1000g2014oct_afr (dbl), CLINSIG
## (fctr), CLNDBN (fctr), CLNACC (fctr), CLNDSDB (fctr), CLNDSDBID (fctr),
## DP (int), MP (dbl), GP (dbl), TG (fctr), TP (dbl), SG (fctr), SP (dbl),
## DS (fctr), GT.NORMAL (fctr), GT.TUMOUR (fctr), PM.NORMAL (dbl),
## PM.TUMOUR (dbl)
Now we can see that we have gg/ggplot object (pcPlot) and in this the data has been defined.
Important information on how to map the data to the visual properties (aesthetics) of the plot as well as what type of plot to use (geom) have however yet to specified.
pcPlot$mapping
## * NULL ->
pcPlot$theme
## list()
pcPlot$layers
## list()
The information to map the data to the plot can be added now using the aes() function.
pcPlot <- ggplot(data=anno)
pcPlot <- pcPlot+aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr)
pcPlot$mapping
## $x
## X1000g2014oct_eur
##
## $y
## X1000g2014oct_afr
pcPlot$theme
## list()
pcPlot$layers
## list()
But we are still missing the final component of our plot, the type of plot to use (geom).
Below the geom_point function is used to specify a point plot, a scatter plot of Height values on the x-axis versus Weight values on the y values.
pcPlot <- ggplot(data=anno)
pcPlot <- pcPlot+aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr)
pcPlot <- pcPlot+geom_point()
pcPlot$mapping
## $x
## X1000g2014oct_eur
##
## $y
## X1000g2014oct_afr
pcPlot$theme
## list()
pcPlot$layers
## [[1]]
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
Now we have all the components of our plot, we need we can display the results.
pcPlot
## Warning: Removed 73951 rows containing missing values (geom_point).
More typically, the data and aesthetics are defined within ggplot function and geoms applied afterwards.
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))
pcPlot+geom_point()
## Warning: Removed 73951 rows containing missing values (geom_point).
As we have seen, an important element of a ggplot is the geom used. Following the specification of data, the geom describes the type of plot used.
Several geoms are available in ggplot2:-
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))
pcPlot_line <- pcPlot+geom_line()
pcPlot_line
## Warning: Removed 73665 rows containing missing values (geom_path).
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))
pcPlot_smooth <- pcPlot+geom_smooth()
pcPlot_smooth
## Warning: Removed 73951 rows containing non-finite values (stat_smooth).
pcPlot <- ggplot(data=anno,
mapping=aes(x=Func.refGene))
pcPlot_bar <- pcPlot+geom_bar()
pcPlot_bar
pcPlot_bar + coord_flip()
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur))
pcPlot_hist <- pcPlot+geom_histogram()
pcPlot_hist
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 73665 rows containing non-finite values (stat_bin).
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur))
pcPlot_density <- pcPlot+geom_density()
pcPlot_density
## Warning: Removed 73665 rows containing non-finite values (stat_density).
pcPlot <- ggplot(data=anno,
mapping=aes(x=Func.refGene,y=X1000g2014oct_eur))
pcPlot_boxplot <- pcPlot+geom_boxplot()
pcPlot_boxplot
## Warning: Removed 73665 rows containing non-finite values (stat_boxplot).
pcPlot <- ggplot(data=anno,
mapping=aes(x=Func.refGene,y=X1000g2014oct_eur))
pcPlot_violin <- pcPlot+geom_violin()
pcPlot_violin
## Warning: Removed 73665 rows containing non-finite values (stat_ydensity).
An overview of geoms and thier arguments can be found at ggplot2 documentation or within the ggplot2 cheatsheet.
In order to change the property on an aesthetic of a plot into a constant value (e.g. set colour of all points to red) we can supply the colour argument to the geom_point() function.
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))
pcPlot+geom_point(colour="red")
## Warning: Removed 73951 rows containing missing values (geom_point).
As we discussed earlier however, ggplot2 makes use of aesthetic mappings to assign variables in the data to the properties/aesthetics of the plot. This allows the properties of the plot to reflect variables in the data dynamically.
In these examples we supply additional information to the aes() function to define what information to display and how it is represented in the plot.
First we can recreate the plot we saw earlier.
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))
pcPlot+geom_point()
## Warning: Removed 73951 rows containing missing values (geom_point).
Now we can adjust the aes mapping by supplying an argument to the colour parameter in the aes function. (Note that ggplot2 accepts “color” or “colour” as parameter name)
This simple adjustment allows for identifaction of the separation between male and female measurements for height and weight.
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr,colour=GT.TUMOUR))
pcPlot+geom_point()
## Warning: Removed 73951 rows containing missing values (geom_point).
Similarly the shape of points may be adjusted.
pcPlot <- ggplot(data=anno,
mapping=aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr,shape=GT.TUMOUR))
pcPlot+geom_point()
## Warning: Removed 73951 rows containing missing values (geom_point).
The aesthetic mappings may be set directly in the geom_point() function as previously when specifying red. This can allow the same ggplot object to be used by different aesethetic mappings and varying geoms
pcPlot <- ggplot(data=anno)
pcPlot+geom_point(aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr,colour=GT.TUMOUR))
## Warning: Removed 73951 rows containing missing values (geom_point).
pcPlot+geom_point(aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr,colour=GT.NORMAL))
## Warning: Removed 73951 rows containing missing values (geom_point).
pcPlot+geom_point(aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr,colour=GT.TUMOUR,shape=GT.TUMOUR,size=DP))
## Warning: Removed 73951 rows containing missing values (geom_point).
pcPlot+geom_violin(aes(x=is.na(GT.TUMOUR),y=X1000g2014oct_eur,fill=GT.TUMOUR))
## Warning: Removed 73665 rows containing non-finite values (stat_ydensity).
Again, for a comprehensive list of parameters and aesthetic mappings used in geom_type functions see the ggplot2 documentation for individual geoms by using ?geom_type
?geom_point
or visit the ggplot2 documentations pages and cheatsheet
One very useful feature of ggplot is faceting. This allows you to produce plots subset by variables in your data.
ggplot2 is not compatible with the way that plots are constructed in base R, so we cannot use techniques like mfrow=c(..) and mfrow=c(..) to split our plotting window into different cells. Fortunately, facet provides a more-intuitive way of composing a plot based on the variables in the data.
To facet our data into multiple plots we can use the facet_wrap or facet_grid function specifying the variable we split by.
The facet_grid function is well suited to splitting the data by two factors.
To split by one factor we can apply the facet_grid() function ommiting the variable before the “~”" to facet along columns in plot.
facet_grid(~Columns)
pcPlot <- ggplot(data=anno,aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))+geom_point()
pcPlot + facet_grid(~GT.TUMOUR)
## Warning: Removed 73951 rows containing missing values (geom_point).
To split along rows in plot, the variable is placed before the “~.”.
facet_grid(Rows~.)
pcPlot <- ggplot(data=anno,aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))+geom_point()
pcPlot + facet_grid(GT.TUMOUR~.)
## Warning: Removed 73951 rows containing missing values (geom_point).
The facet_wrap() function offers a less grid based structure but is well suited to faceting data by one variable.
For facet_wrap() we follow as similar syntax to facet_grid()
pcPlot <- ggplot(data=anno,aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))+geom_point()
pcPlot + facet_wrap(~Chr)
## Warning: Removed 73951 rows containing missing values (geom_point).
A point to note here is that the order of the chromosomes is alphabetical rather than numeric. If we wanted a more natural order, we can re-factor the variable in the data frame to get the desired effect.
anno <- mutate(anno, Chr=factor(Chr, levels=c(1:22, "X","Y")))
pcPlot <- ggplot(data=anno,aes(x=X1000g2014oct_eur,y=X1000g2014oct_afr))+geom_point()
pcPlot + facet_wrap(~Chr)
## Warning: Removed 73951 rows containing missing values (geom_point).
## Warning: Removed 73665 rows containing non-finite values (stat_boxplot).
## Warning: Removed 73665 rows containing non-finite values (stat_boxplot).
filter the data frame appropriatelyDP) along each chromosomeIn the last plot we started to look at our data on a genome-wide scale.
ggbio package is a toolkit for producing publication-quality images from genomic dataggplot2So the first stage is to transform our data into a GenomicRanges object
library(GenomicAlignments)
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:lubridate':
##
## intersect, setdiff, union
## The following objects are masked from 'package:dplyr':
##
## combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## IQR, mad, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, append, as.data.frame, cbind, colnames,
## do.call, duplicated, eval, evalq, Filter, Find, get, grep,
## grepl, intersect, is.unsorted, lapply, lengths, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, setdiff,
## sort, table, tapply, union, unique, unsplit
## Loading required package: S4Vectors
## Loading required package: stats4
##
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:lubridate':
##
## second, second<-
## The following objects are masked from 'package:dplyr':
##
## first, rename
## The following object is masked from 'package:tidyr':
##
## expand
## The following objects are masked from 'package:base':
##
## colMeans, colSums, expand.grid, rowMeans, rowSums
## Loading required package: IRanges
##
## Attaching package: 'IRanges'
## The following object is masked from 'package:lubridate':
##
## %within%
## The following objects are masked from 'package:dplyr':
##
## collapse, desc, regroup, slice
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: SummarizedExperiment
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
## Loading required package: Biostrings
## Loading required package: XVector
## Loading required package: Rsamtools
##
## Attaching package: 'GenomicAlignments'
## The following object is masked from 'package:dplyr':
##
## last
anno.gr <- GRanges(paste0("chr",anno$Chr), IRanges(anno$Start,anno$End))
mcols(anno.gr) <- select(anno, -(Chr:End))
anno.gr
## GRanges object with 77702 ranges and 29 metadata columns:
## seqnames ranges strand | Ref Alt
## <Rle> <IRanges> <Rle> | <factor> <factor>
## [1] chr1 [ 10181, 10181] * | A C
## [2] chr1 [ 17222, 17222] * | A G
## [3] chr1 [ 74570, 74570] * | T A
## [4] chr1 [124583, 124583] * | C T
## [5] chr1 [231455, 231455] * | T A
## ... ... ... ... . ... ...
## [77698] chrY [59028563, 59028563] * | T C
## [77699] chrY [59028653, 59028653] * | C T
## [77700] chrY [59028833, 59028833] * | G T
## [77701] chrY [59031297, 59031297] * | A G
## [77702] chrY [59032861, 59032861] * | T G
## Func.refGene Gene.refGene GeneDetail.refGene
## <factor> <factor> <factor>
## [1] intergenic NONE,DDX11L1 dist=NONE;dist=1693
## [2] ncRNA_intronic WASH7P <NA>
## [3] intergenic OR4F5,LOC729737 dist=4562;dist=60203
## [4] intergenic OR4F5,LOC729737 dist=54575;dist=10190
## [5] intergenic LOC729737,LOC100132062 dist=90889;dist=92437
## ... ... ... ...
## [77698] intergenic NONE,SPRY3 dist=NONE;dist=71840
## [77699] intergenic NONE,SPRY3 dist=NONE;dist=71750
## [77700] intergenic NONE,SPRY3 dist=NONE;dist=71570
## [77701] intergenic NONE,SPRY3 dist=NONE;dist=69106
## [77702] intergenic NONE,SPRY3 dist=NONE;dist=67542
## ExonicFunc.refGene AAChange.refGene cytoBand gwasCatalog
## <factor> <factor> <factor> <factor>
## [1] <NA> <NA> 1p36.33 <NA>
## [2] <NA> <NA> 1p36.33 <NA>
## [3] <NA> <NA> 1p36.33 <NA>
## [4] <NA> <NA> 1p36.33 <NA>
## [5] <NA> <NA> 1p36.33 <NA>
## ... ... ... ... ...
## [77698] <NA> <NA> Yq12 <NA>
## [77699] <NA> <NA> Yq12 <NA>
## [77700] <NA> <NA> Yq12 <NA>
## [77701] <NA> <NA> Yq12 <NA>
## [77702] <NA> <NA> Yq12 <NA>
## snp129 X1000g2014oct_eur X1000g2014oct_afr CLINSIG CLNDBN
## <factor> <numeric> <numeric> <factor> <factor>
## [1] <NA> <NA> <NA> <NA> <NA>
## [2] rs2981830 <NA> <NA> <NA> <NA>
## [3] <NA> <NA> <NA> <NA> <NA>
## [4] <NA> <NA> <NA> <NA> <NA>
## [5] <NA> <NA> <NA> <NA> <NA>
## ... ... ... ... ... ...
## [77698] <NA> <NA> <NA> <NA> <NA>
## [77699] <NA> <NA> <NA> <NA> <NA>
## [77700] <NA> <NA> <NA> <NA> <NA>
## [77701] rs28461249 <NA> <NA> <NA> <NA>
## [77702] rs2380015 <NA> <NA> <NA> <NA>
## CLNACC CLNDSDB CLNDSDBID DP MP GP
## <factor> <factor> <factor> <integer> <numeric> <numeric>
## [1] <NA> <NA> <NA> 63 0.90 0.00330
## [2] <NA> <NA> <NA> 36 0.90 0.09800
## [3] <NA> <NA> <NA> 15 0.92 0.07500
## [4] <NA> <NA> <NA> 13 0.97 0.01700
## [5] <NA> <NA> <NA> 109 1.00 0.00056
## ... ... ... ... ... ... ...
## [77698] <NA> <NA> <NA> 192 0.93 1.9e-07
## [77699] <NA> <NA> <NA> 174 1.00 6.1e-04
## [77700] <NA> <NA> <NA> 183 1.00 4.0e-03
## [77701] <NA> <NA> <NA> 259 1.00 2.9e-04
## [77702] <NA> <NA> <NA> 232 1.00 1.2e-03
## TG TP SG SP DS GT.NORMAL
## <factor> <numeric> <factor> <numeric> <factor> <factor>
## [1] AA/AC 0.90 AA/AA 0.09600 character(0) 0|0
## [2] AA/AG 0.90 AG/AG 0.09800 character(0) 0|0
## [3] TT/AT 0.91 AT/AT 0.07500 character(0) 0|0
## [4] CC/TT 0.59 CC/CT 0.38000 character(0) 0|0
## [5] TT/AT 1.00 AT/TT 0.00055 character(0) 0|0
## ... ... ... ... ... ... ...
## [77698] TT/CT 0.93 TT/TT 0.06900 character(0) 0|0
## [77699] CC/CT 1.00 CT/CC 0.00061 character(0) 0|0
## [77700] GG/GT 1.00 GT/GG 0.00400 character(0) 0|0
## [77701] AA/AG 1.00 AG/AA 0.00029 character(0) 0|0
## [77702] TT/GT 1.00 GT/TT 0.00120 character(0) 0|0
## GT.TUMOUR PM.NORMAL PM.TUMOUR
## <factor> <numeric> <numeric>
## [1] 0|1 0.000 0.14
## [2] 0|1 0.067 0.48
## [3] 0|1 0.000 0.50
## [4] 1|1 0.000 0.75
## [5] 0|1 0.056 0.16
## ... ... ... ...
## [77698] 0|1 0.032 0.10
## [77699] 0|1 0.068 0.16
## [77700] 0|1 0.062 0.16
## [77701] 0|1 0.056 0.14
## [77702] 0|1 0.051 0.14
## -------
## seqinfo: 24 sequences from an unspecified genome; no seqlengths
ggbio has a convenient function called autoplot which will try and predict what visualisation we want from a particular data type.
In this case, we can plot the coordinates of a set of variants. You should recognise from the distinctive grey colour scheme that ggbio is using ggplot2 to create the plot.
The plotGrandLinear function
library(ggbio)
## Warning: replacing previous import 'ggplot2::Position' by
## 'BiocGenerics::Position' when loading 'ggbio'
## Need specific help about ggbio? try mailing
## the maintainer or visit http://tengfei.github.com/ggbio/
##
## Attaching package: 'ggbio'
## The following objects are masked from 'package:ggplot2':
##
## geom_bar, geom_rect, geom_segment, ggsave, stat_bin,
## stat_identity, xlim
plotGrandLinear(anno.gr, aes(y = DP))
## using coord:genome to parse x scale
We see the first variant is located in the gene “WASH7P”. The ggbio package will allow us to visualise where this variant is located with respect to the gene. The `Homo.sapiens’ package is
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
## Loading required package: GenomicFeatures
## Loading required package: AnnotationDbi
##
## Attaching package: 'AnnotationDbi'
## The following object is masked from 'package:dplyr':
##
## select
data(genesymbol, package="biovizBase")
wh <- genesymbol["WASH7P"]
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
autoplot(txdb,which=wh)
## Parsing transcripts...
## Parsing exons...
## Parsing cds...
## Parsing utrs...
## ------exons...
## ------cdss...
## ------introns...
## ------utr...
## aggregating...
## Done
## "gap" not in any of the valid gene feature terms "cds", "exon", "utr"
## Constructing graphics...
ggbio can also plot a GRanges object. Which is not very impressive by itself.
autoplot(anno.gr[1],which=wh)
However, we can combine plots using the tracks function
tracks(autoplot(txdb,which=wh),
autoplot(anno.gr[2])
)
## Parsing transcripts...
## Parsing exons...
## Parsing cds...
## Parsing utrs...
## ------exons...
## ------cdss...
## ------introns...
## ------utr...
## aggregating...
## Done
## "gap" not in any of the valid gene feature terms "cds", "exon", "utr"
## Constructing graphics...
ggbio is also able to plot the location of our sequencing reads.
seqlevelsStyle(wh) <- "Ensembl"
wh <- keepSeqlevels(wh,"1")
bam <- paste0(path.to.bam, "/HCC1143.mix1.n40t60.bam")
#reads <- readGAlignments(file = bam, param=ScanBamParam(which=wh),use.names = TRUE)
autoplot(bam, which=wh)
## reading in as Bamfile
## Parsing raw coverage...
## Read GAlignments from BamFile...
## extracting information...
library(BSgenome.Hsapiens.UCSC.hg19)
## Loading required package: BSgenome
## Loading required package: rtracklayer
bg <- BSgenome.Hsapiens.UCSC.hg19
autoplot(bam,bsgenome=bg, stat="mismatch",which=wh)
## reading in as Bamfile
seqlevelsStyle(txdb) <- "Ensembl"
tracks(Genes=autoplot(txdb,which=wh),
Variant=autoplot(anno.gr[2]),
Mismatch=autoplot(bam,bsgenome=bg, stat="mismatch",which=wh)
)
## Parsing transcripts...
## Parsing exons...
## Parsing cds...
## Parsing utrs...
## ------exons...
## ------cdss...
## ------introns...
## ------utr...
## aggregating...
## Done
## "gap" not in any of the valid gene feature terms "cds", "exon", "utr"
## Constructing graphics...
## reading in as Bamfile
seqlevelsStyle(txdb) <- "Ensembl"
tracks(Genes=autoplot(txdb,which=wh),
Variant=autoplot(anno.gr[2]),
Mismatch=autoplot(bam,bsgenome=bg, stat="mismatch",which=wh),
heights=c(4,1,4)
)
## Parsing transcripts...
## Parsing exons...
## Parsing cds...
## Parsing utrs...
## ------exons...
## ------cdss...
## ------introns...
## ------utr...
## aggregating...
## Done
## "gap" not in any of the valid gene feature terms "cds", "exon", "utr"
## Constructing graphics...
## reading in as Bamfile
The Gviz package was finer-control over the plotting of data as tracks.
library(Gviz)
## Loading required package: grid
vars <- AnnotationTrack(anno.gr[1:10])
gtrack <- GenomeAxisTrack()
itrack <- IdeogramTrack(genome = "hg19", chromosome = "chr1")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
grtrack <- GeneRegionTrack(txdb)
plotTracks(list(itrack, gtrack, vars, grtrack),from=10000,to=20000,collapseTranscripts=TRUE,transcriptAnnotation="symbol")
varpos <- start(anno.gr[2])
reads <- readGAlignments(bam, use.names=TRUE,param=ScanBamParam(which = GRanges("1",IRanges(varpos-20, varpos+20)),what=c("cigar","mapq","seq","flag")))
seqlevelsStyle(reads) <- "UCSC"
altrack <- AlignmentsTrack(start=start(reads),end=end(reads),chromosome = seqnames(reads),cigar=mcols(reads)$cigar,genome="hg19")
strack <- SequenceTrack(Hsapiens, chromosome = "chr1")
plotTracks(list(itrack, vars, altrack), from = varpos-20, to = varpos+20, cex = 0.8,collapseTranscripts=TRUE)
ht <- HighlightTrack(trackList = list(altrack, strack), start=varpos-1,width=2,chromosome="chr1")
plotTracks(list(itrack,vars,ht,grtrack), from = varpos-20, to = varpos+20, cex = 0.8)
## Warning in consStr != ref: longer object length is not a multiple of
## shorter object length